-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Incorrect use of partial in TweedieDistribution._rowwise_gradient_hessian
#889
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for finding fixing this! I believe there is a similar mistake here:
glum/src/glum/_distribution.py
Line 706 in 9c0a221
f = partial(inv_gaussian_log_eta_mu_deviance, p=self.power) |
Could you please address it too so that they are in the same PR?
Also, I think it'd be worth adding a test case that would have spotted these bugs (maybe here?), but I'm also o kay with doing it later.
Hm, it might be a bit more involved than that. Is it possible that @@ -276,12 +276,12 @@ def inv_gaussian_log_rowwise_gradient_hessian(
inv_mu = 1 / mu[i]
inv_mu2 = inv_mu ** 2
- gradient_rows_out[i] = 2 * weights[i] * (inv_mu - y[i] * inv_mu2)
- hessian_rows_out[i] = 2 * weights[i] * (inv_mu - 2 * y[i] * inv_mu2)
+ gradient_rows_out[i] = weights[i] * (y[i] * inv_mu2 - inv_mu)
+ hessian_rows_out[i] = weights[i] * (2 * y[i] * inv_mu2 - inv_mu) instead? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you :)
@bili2002, do you mind if I push the fix for those formulas to this branch before merging this PR? |
@stanmart Sure, go ahead with the formulas! |
@lbittarello, could you please check if my changes make sense? |
Also, if it's correct, I believe the relevant part of quantco/objectives needs to be updated, too. |
Co-authored-by: Luca Bittarello <15511539+lbittarello@users.noreply.github.com>
assert_allclose(glm.intercept_, glmnet_intercept, rtol=1e-3) | ||
assert_allclose(glm.coef_, glmnet_coef, rtol=1e-3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what happens if you use the full Hessian? Do the tests pass? Are the coefficients closer to glmnet
's?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The glmnet test passes and the coefficients are roughly as close as in the other case. Another test (tests/glm/test_glm.py::test_glm_family_argument[inverse.gaussian-fam4]
) fails with a singular matrix LinalgError
, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failing test might not be an issue though as I believe the model was originally meant to be penalized, which but it is not since glum v3. 🤔 Anyways, I think it needs a bit more thinking and testing if we want to go with the true Hessian. I'd vote for merging this fix relatively soon, and possibly switching to the true Hessian later on if we convince ourselves that it works.
This PR addresses issue #888 by fixing the incorrect use of partial in the
TweedieDistribution._rowwise_gradient_hessian
function.Checklist
CHANGELOG.rst
entry